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Recent experiments show that a substantial energy gap in graphene can be induced via patterned 
hydrogenation on an iridium substrate. Here, we show that the energy gap is roughly proportional to 
VNh/Nc when disorder is accounted for, where Nh and Nc denote concentration of hydrogen and 
carbon atoms, respectively. The dispersion relation, obtained through calculation of the momentum- 
energy resolved density of states, is shown to agree with previous angle-resolved photoemission 
spectroscopy results. Simulations of electronic transport in finite size samples also reveal a similar 
transport gap, up to 1 eV within experimentally achievable ^/Nh/Nc value. 



Introduction: Graphene's novel electronic and physical properties makes it an interesting material for various 
potential applications [1|[2- For electronics, the absence of an energy gap limits its applicability. Currently, there 
are myriad known ways to opening an energy gap in graphene. For instance, the patterning of graphene nanoribbons 
induces a band gap due to the confinement of carriers along the transverse direction [3 . Although there are recent 
proposals in achieving controlled width and smooth edges [4 , the large scale fabrication of nanoribbons remains a 
challenge. The graphene nanomesh, also known as a graphene antidot lattice, is a viable alternative [5, 6 . Here, the 
confinement potential is created by clusters of vacancies, i.e. nanoholes, arranged on a regular superlattice. They 
are prepared using block copolymer lithography. Two structural parameters, the cluster size and neck width of the 
superlattice of nanoholes, govern the electronic transport properties observed in experiments [6 . A larger energy gap 
is induced when both these two parameters are reduced. 

Another variant of nanomesh, formed by periodic pattern of hydrogen clusters, has recently been observed for 
graphene grown on an iridium substrate [7 . The periodicity is due to the fact that the composite structure of 
graphene and iridium forms a superlattice, with the hydrogenation occuring preferentially on specific superlattice 
sites. The resulting structure can be regarded as a variant of the nanomesh since regions of hydrogenated graphene 
are highly insulating jSHTO] . Nanomesh via patterned hydrogenation is a promising approach since its cluster size and 
neck width can be much smaller than the lithographically defined case. Indeed, the opening of a substantial energy 
gap has been revealed by angle-resolved photoemission spectroscopy (ARPES) \Z,. 

Theoretical investigations of the electronic properties of nanomesh/pattern- hydrogenated graphene have been 
limited so far to band-structure calculations using primitive supercells |5l [71 [Til [H] • This approach can treat disorder 
in the cluster shape only within the same supercell [7 . Instead, in this work, we present a modeling study of 
pattern-hydrogenated graphene that also includes disorder across different supercells. Through calculations of the 
momentum-energy resolved density of states and its electrical conduction, we study the scaling of the energy gap on 
the parameters defining the patterned hydrogenation, i.e. cluster size, filling factor and neck width. 

The model: A simple tight-binding model is employed to describe the composite structure of graphene with 
adsorbed hydrogen atoms [13l [14]. Within this model, the basis consists of a 2pz orbital per carbon atom and a 
Is orbital per hydrogen atom: the parameters describing the carbon-carbon hopping integral (7 = 2.6 eV), carbon- 
hydrogen hopping integral {jh = 5.72 eV), and hydrogen onsite energy {en = eV) are taken from Ref. 14, Such a 
minimal model captures the essential physics of the hydrogenation effect pertinent for our study, that is the removal 
of the Pz orbital of the hydrogenated carbon atom from the tt and tt* bands [34 . Since our purpose is to study the 
intrinsic properties of the nanomesh, we neglect the interaction with the iridium substrate in the TB model. As a 
consequence, particle- hole symmetry is preserved and the neutrality point remains located at the Dirac point. 

Graphene on an iridium substrate forms a superlattice due to the mismatch between their respective lattice 
constants; 10x10 graphene unit cells are commensurate with 9x9 iridium unit cells [15]. The superlattice unit cell is 
represented in Fig. [l^. The supercell preserves the symmetry of the graphene unit cell [35 , resulting in a honeycomb 
superlattice. Experimentally [7 , it was shown that the hydrogen clusters tend to form in regions indicated by circles 
in Fig. [1^, where one graphene sublattice sits directly on top of iridium atoms. A hydrogenation model is developed 
to reproduce this preferential adsorption (see details in Supp. info). The model takes as input two parameters: a 
discrete quantity which represents the cluster radius, and the filling factor Uc i.e. ratio between the number 



* E-mail: rgrassi@arces.unibo.it 

t Current address: IBM T.J. Watson Research Center, Yorktown Heights, New York 10598, USA 



simulation 




experiment 



-0.2 0.2 



-0.2 0.2 





0.2 0.2 



FIG. 1: Schematic representation of the atomic structure under study and comparison of the simulated /c-resolved density 
of states in energy with the experimental ARPES. (a) Top view of the supercell of graphene on iridium substrate. The two 
graphene sublattices are indicated with different colors. The supercell is symmetric under reflection across the dashed line, 
except for the interchange of the two graphene sublattices. The two circles highlight the regions of the supercell where the 
clusters tend to form, (b) Top view of two hydrogenated samples with different cluster concentration. Hydrogen atoms are 
represented as black dots on the honeycomb graphene lattice and the iridium substrate is not shown. SI is obtained with the 
model parameters Nw — Uc — 0.75, while S2 with = 4, ric = 1. (c) Calculated momentum-energy resolved density of 
states for two sets of hydrogenated samples that correspond to the cases SI and S2 shown in (b). 50 samples are considered for 
each set, the plotted quantity being the average. The inset shows the direction within the graphene Brillouin zone (red line) 
along which the calculation is performed, (d) Experimental ARPES intensity for different times of exposure to hydrogen (as 
indicated in the labels), reprinted by permission from Macmillan Publishers Ltd: Nature Materials| 9, 315, copyright 2010. 



of clusters and the number of half-supercells. Two types of disorder are considered: i) irregular cluster edges and 
a) a random filling of the superlattice (if ric < 1). See Fig. for illustrations. Each cluster is generated by 
adding hydrogen atoms on top of the carbon atoms belonging to a certain number of shells around the center of the 
half-supercell, and the edge disorder is introduced by partial hydrogenation of the outer shell. Both A and B sites 
are hydrogenated within each cluster, contrary to what is expected in the real structure [7 . The additional hydro- 
gen atoms play the role of the neglected interactions with the substrate in removing the pseudo dangling bonds [71 116] . 

Key features in momentum-energy resolved density of states: In order to study the electronic properties 
of pattern- hydrogenated graphene as seen in ARPES experiments, the calculation of the momentum-energy resolved 
density of states is required. This quantity is given, apart from a normalization factor, by the diagonal elements of 
the spectral function in momentum space, 74(k, k;£^). While this quantity reduces to the usual band structure for 
periodic systems, it is a general concept and is valid even for disordered systems. The calculation is performed by 
first computing the spectral function in real space A{y^y']E) and then Fourier transforming to get A(k, k;£^) [36j. 
The calculation is done using the Green's function formalism with a recursive algorithm for periodic structures. See 
Supp. info for detailed numerical description and implementation. 

In Fig. [l]^, we plot the averaged A(k, k; E) for two ensembles corresponding to the two realizations shown in 
Fig. [l]3, along a path in /c-space that includes the K point [37 . The convergence of the result with respect to 
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FIG. 2: Band-gap extraction, (a) Momentum-energy resolved density of states for two sets of hydrogenated samples on iridium 
substrate: S3 is obtained with the model parameters Nw = 2, ric = 1, while S4 with = 3, ric = 1. Different fitting curves are 
used (white lines), given by Eqs. 31 and 29 of Supp. info, for S3 and S4, respectively. The band gap is extracted with respect 
to the fitting curve, (b) Band gap extracted for the various sets of samples and plotted as a function of V Nh / Nc , where Nh 
and Nc are the average number of hydrogen and carbon atoms in the half-supercell, respectively. The fitting functions used 
for the extraction are listed, for each set, in Table 1 of Supp. info. SLIO stands for graphene on iridium substrate (supercell 
made of 10x10 graphene unit cells, see Fig. [TJi), while SL13 refers to graphene on a fictitious substrate (supercell made of 
13x13 graphene unit cells). The error bar is a measure of the broadening of the A(k, k; E) plot. The dashed line is the curve 
Eg — 2/ii;i?7r/A, with vp — (3/2)acc|7|/^ the Fermi velocity in pristine graphene and A = yj As/2 {Nc / VNh) , where acc is 
the carbon-carbon distance and As = acc3v^/2 the area of the unit cell of pristine graphene. 



sample size and ensemble size has been checked, as reported in Supp. info. Only the negative energies are shown, 
as the conduction bands are symmetrical to the valence ones due to particle-hole symmetry. The corresponding 
experimental ARPES image [7] corresponding to two different hydrogen doses is shown in Fig. [l]i. Several distinctive 
features are observed in both simulations and experiments. In both cases, the two valence branches intersect 
at a lower energy than the Dirac point. In addition, the signal of the states lying at the K point between 
E = and the intersection energy gets suppressed with increasing hydrogen doping. These features can be 
interpreted as a band-gap opening. The presence of a flat band at = in the simulation results is a well-known 
effect, due to the imbalance between the two graphene sublattices [17]. The absence of these states in the exper- 
imental ARPES could be related to bond relaxation and sp^ hybridization [18], which are neglected in the simulations. 

Scaling of energy gap: The energy gap is extracted from the momentum-energy resolved density of states 
for different sets of samples, corresponding to different values of cluster size, filling factor, and supercell size. 
The supercell size is increased by considering a fictitious substrate other than iridium: for fixed cluster size, this 
corresponds to increasing the neck width. Fig. [2|l illustrates the fitted band edges from 74(k, k;£^), with details 
in Supp. info. An apparent universal scaling relation for the band gap is obtained when we plot the extracted 
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FIG. 3: Transport simulations, (a) Conceptual device under investigation: pattern- hydrogenated graphene is transferred to an 
insulating substrate and used as channel material of a field-effect transistor, (b) Profile of the potential energy used to simulate 
the structure in (a): the Fermi level in the leads Ef is kept fixed, while the barrier height Vch is varied to reproduce the effect 
of the back gate. Pristine graphene is used for the leads, (c) Zero-temperature conductance vs. V^ch for various sets of samples 
with = L = 30 nm and iridium substrate (SLIO). From left to right, the cluster size, i.e. N^, is increased; within the same 
plot, the cluster concentration ric is varied. 100 samples are considered for each set and the average is done on the logarithm 
of the normalized conductance (a motivation for this type of averaging can be found in [22 ). The vertical lines indicate the 
band gap from Fig. [2]d. 

band-gap values Eg (together with a measure of the broadening of each 74(k, k; E) plot as error bar) against the 
quantity \/Nh /Nq^ where Nh and Nq are the average number of hydrogen and carbon atoms in the half-supercell 
(Fig. [2]3) [38 . A similar relation also applies for the case of triangular graphene nanomesh [5j. In Ref. 11, it was 
stated that a universal relation does not hold for honeycomb graphene nanomesh. However, Fig. suggests that 
when disorder is included in the simulations, the scaling law Eg = c\/Nh/Nc^ with c a constant, can be valid at 
low defect coverage for honeycomb superlattices as well. This is similar to the case of graphene nanoribbons, where 
theoretically the band gap depends on the precise atomic configuration [19], while a general law Eg oc with W 

the ribbon width, is always observed in the experiments [3] and commonly attributed to disorder [5ol[2T]. Regarding 
the proportionality constant, we found that the expression c = 2hvFT^ j ^ Asl2^ with vf the graphene Fermi velocity 
and As the area of the graphene unit cell, fits fairly well the numerical data (dashed line in Fig. |2]3). We note that, 
by defining A = ^J As/2 [Nc/\/^h)^ the scaling relation takes the form Eg = 2hvF7r/A. This equation can be 
thought of as arising from the quantization of the graphene dispersion relation E = ±fiVi7|k| (k is here the wave 
vector around the K point) with |k| = tt/A (ID quantization in random directions): A can thus be interpreted as 
an effective confinement length. 

Transport gap: Next, we examine the electronic transport properties of pattern- hydrogenated graphene. Tech- 
niques for the transfer of graphene grown on metal surfaces to an insulating substrate have recently been developed 
[23] . We consider a three-terminal structure as shown in Fig. [3|i and aim at predicting its low-temperature, low-bias 
conductance. Fig. ^jp illustrates the potential energy along the device. The potential energy in the source and drain 
leads, as a result of metal induced doping, is kept fixed with respect to the Fermi level Ep. The channel potential 
Vch is modulated by the back gate. Graphene is aligned with its armchair direction along the longitudinal direction 
of the device, in order to avoid edge transport effects. Only the channel is hydrogenated while the leads are pristine 
graphene. The conductance is computed by using the standard Green's function technique [24] combined with a mod- 
ified version of the algorithm described in Ref. 25 , which is commonly used for the calculation of the lead self-energies 
(see Supp. info). 

Fig. (Sj) shows the simulated, ensemble averaged zero-temperature conductance G vs. Vch- The device size 
is kept fixed at = L = 30 nm, while different sets of hydrogenated samples are considered. It can be seen 
that patterned hydrogenation leads to a clear transport gap, increasing with N^j and ric. Also, the transport 
simulations agree well with our band-structure results: the transport gap matches the band gap from the 
A(k, k; E) fitting (as indicated by vertical lines in Fig. ^) and the peaks in the transport gap region correspond to 
the gap states in A(k, k; £^) [39 . The G vs. Vch curve appears symmetrical, unlike the case for pristine graphene 
[26] . This suggests that scattering is dominated by the channel, instead of the tunneling resistance due to pn junctions. 




FIG. 4: Decay length extraction, (a) Decay length vs. Vch for sets of samples with different cluster size and fixed supercell size 
(SLIO) and cluster concentration (uc — 1). The vertical lines indicate the band gap from Fig. [2]3. (b) Example of the decay 
length extraction at two different V^ch points. The dashed lines indicate the fitting with the formula ln[G'/ (2e^//i)] = In qq — L/^. 
(c) Average value of the decay length in the "off" and "on" state for various sets of samples with different supercell and cluster 
sizes, plotted as a function of ^/Nh/Nc- The dashed lines indicate the fitting curve ^ oc Nc / V Nh • The off state is defined as 
the bias region iKh — < Eg/'^ — B, where B is the half-broadening from Fig.jSj), while the on state as 0.65 eV < |Kh — < 
0.75 eV. 



Scaling of transport coefficients: Finally, we examine how G scales with L. Here, we consider devices with 
filling factor Uc = I. The conductance is found to scale as G ex exp(— L/^), where ^ is a decay length. Fig. [4^ plots 
the extracted ^ as function of Vch bias, while Fig.|4]3 illustrates the extraction of ^ for two particular V^ch values. Next, 
we extract the average value ^ of the decay length in the "off" and "on" state and plot it against \/Nh /Nc as we have 
done previously for the band gap (Fig. [4]:), see caption for the definition of the "on" and "off" states). One observes 
that, for almost all the samples, the value of the decay length in the "off" state is about an order of magnitude smaller 
than the corresponding value in the "on" state. For both cases, the average decay length seems to follow the general 
scaling law ^ ex Nc / \/ Nh at low to moderate hydrogenation concentrations, albeit the "off" state exhibits a smaller 
proportionality constant. We note that in the "off" state the exponential decrease of G with L can be explained 
as evanescent transport through a clean band gap [20]; the scaling law f cx A cx 1/Eg is in agreement with this 
interpretation. In the "on" state instead, the exponential decrease of G with L is an effect of quantum localization 
due to disorder and ^ takes the meaning of a localization length [27 . The scaling law ^ cx A is here less clear and an 
exponential dependence could also be possible, as suggested by recent experiments [28]. Moreover, dephasing effects, 
which are ignored in our simulations, could restore a diffusive transport regime, G cx as the temperature is raised. 

Conclusions: In conclusion, a simple model for pattern- hydrogenated graphene was presented. Similar features 
are observed in the calculated /c-resolved density of states in energy and in the experimental ARPES. The scaling 
of the energy gap on the parameters Nc and Nh was presented, including its electronic transport properties at low 
temperature. Our results indicate that pattern- hydrogenated graphene is a promising approach to the engineering of 
graphene nanomeshes with extremely scaled cluster sizes and neck widths. 

Acknowledgement: The authors gratefully acknowledge support from Network for Computational Nanotechnol- 
ogy for computational services. 
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Appendix A: Triangular versus honeycomb superlattice 

While patterned hydrogenation has been experimentally demonstrated only for graphene on iridium, other substrate 
materials that can accomodate a monolayer graphene on their surface, e.g. rutenium [29], could in principle be used. 
The resulting superlattice can be of the triangular or honeycomb type. Let m and n be the size of the supercell 
in units of the graphene and substrate lattice constants, respectively. Here it is demonstrated that a honeycomb 
superlattice is obtained whenever m/n = (3M + 1)/(3A^) with M^N e 

Figs. [5^ and[5]3 show the unit cell for the honeycomb graphene lattice and the typical triangular substrate surface 
layer, respectively. According to the definition of m and n given above, we assume that an m x m graphene supercell 
is commensurate with an n x n substrate supercell. Therefore, denoting the lattice vectors of graphene, substrate, 
and superlattice by a^, bj, and Cj {j = 1,2), respectively, we have 

Cj — msij — nhj . (Al) 

Each pair {m^n) and its multiples map to a unique composite system, e.g. (4,3) is the same as (8,6). Hence, we 
consider only the case where m and n are prime to each other, so that Cj are the primitive lattice vectors of the 
superlattice. We further assume that, at some point O inside the supercell, a carbon atom sits directly on top of 
a substrate atom: such arrangement was found to be an energetically stable configuration [15]. We indicate the 
superlattice points that are equivalent to O as 5'^. 

As shown by Fig. [Sj^, in order to generate a honeycomb superlattice, there must be another point Sb inside the 
supercell where a carbon atom belonging to the opposite sublattice sits on top of a substrate atom. Also, for the 
superlattice to be regular, it can be derived that the distance between Sa and Sb must be equal to |ci +C2I/3. 
Inspecting the graphene lattice tells us that Sa and Sb have to be separated by a distance of (3M + l)acC7 where 
M G Z+ and acc is the carbon-carbon distance. Hence, we can write 

1 777/ 

:^ |ci +C2I = — |ai +a2| = (3M+ l)acc ^ m = 3M+l, M G Z+ . (A2) 

In a similar fashion for the substrate, the Sa and Sb have to be separated by a distance of Nass^ where N e and 
ass is the interatomic distance of the substrate. We can then write 

1 77 

3|ci+C2| = -|bi+b2|=iVass ^ n = 3N, NeZ+. (A3) 

In conclusion, the superlattice with the similar honeycomb structure as graphene, shown in Fig.jsj), can be generated 
by satisfying Eqs. |A2||A3l Otherwise, the superlattice would produce a triangular structure instead, with only repeated 
units of Sa- Examples of honeycomb superlattices are the cases m/n = 10/9 (graphene on iridium, also indicated as 
SLIO in the manuscript) and m/n = 13/12 (indicated as SL13 in the manuscript). 




FIG. 5: (a) Graphene unit cell: the two carbon atoms are indicated with different colors, (b) Substrate unit cell, (c) Honeycomb 
superlattice generated by the superposition of the graphene and substrate lattices. Sa (Sb) is the point inside the supercell 
where a carbon atom of the A (B) graphene sublattice sits on top of a substrate atom. 



FIG. 6: Best (a) and worst (b) cases for the probability of hydrogenation of a carbon atom of index i located at a distance 
di — ass/V^ from the nearest-neighbor substrate atoms. Carbon (substrate) atoms are represented with black (gray) balls. 



Appendix B: Model for patterned hydrogenation 



The model for generating the hydrogen clusters is described below. 

Given a sample of pristine graphene on a certain substrate, we divide the structure in supercells, the supercell 
subdivision being chosen so that each supercell contains one Sa and one Sb point in symmetric positions (see Fig. la 
of the main text for an illustration: the centers of the two circles correspond to an Sa or Sb point). Each carbon 
atom can be denoted by the pair of indexes (/,i), where / is the index of the half-supercell to which it belongs and 
i is the atom index inside the whole supercell. We introduce a binary random variable Zi^i G {0, 1} for each carbon 
atom: the atom is hydrogenated when Zi^i = 1. Zi^i in turn is written as the product of other two binary random 
variables Xi and F^, whose probability distributions are given below. 

Yi^ controls the cluster formation inside each supercell. We propose the following formula for the probability 
P{Yi = 1): 



P{Yi = l) = f{wi), f{wi=Q)=Q, ^>0, 

dwj 



(Bl) 



Wi being a quantity defined for each carbon atom as 



Wi 



3 ^{j) 



(B2) 



^cc 



where di is the x?/-plane distance between the carbon atom of index i and its nearest-neighbor substrate atom (let r 
be the position vector in the x7/-plane parallel to the surface), 



mm|rc, - r^J , 

k 



(B3) 



the summation over j is restricted to the three carbon atoms that are nearest neighbor to the carbon atom of index 
i, and c is simply a constant, 




(B4) 



Eqs. B1-B2 can be justified by the following considerations. Experimentally, the hydrogen clusters tend to form 
around the regions where one graphene sublattice is located on top of substrate atoms, while the other sublattice is 
far from substrate atoms and can bind to hydrogen atoms on the opposite face [7 . This translates in two conditions 
for the generic carbon atom to be hydrogenated. First, it should be located in between substrate lattice sites. The 
probability for adsorption whould then increase as its distance di from the nearest-neighbor substrate atom increases. 
This effect is captured by the prefactor in ( |B2[ ). However, if the considered carbon atom is located at the maximum 
distance from substrate atoms, equal to ass/>/3, the probability for adsorption should distinguish between the case 
in which the three nearest-neighbor carbon atoms are located close to substrate atoms (high probability. Fig. |6^) and 
the case in which also the three nearest neighbors are between substrate atoms (low probability. Fig. [gJd). We can 
note that in the first case di ^ \ while in the second case di ~ \ ^/j\dj. This explains the second factor in 



(B2), where the constant c serves only to set the probability to for the worst case (Fig. gJd). 

In (Bl) we have omitted the actual functional dependence of PiYi = 1) on Wi. Since this relationship depends on 
the physical hydrogenation process and it is unknown, we choose here a simple cut-off model. For a given superlattice 
unit cell, all the possible values of Wi are computed and labeled in decreasing order as w^'^\w^'^\ . . . (the location of 



FIG. 7: Supercell of graphene on iridium: location of the carbon atoms with the four largest values of Wi, i.e. Wi — 
Wi — w^'^\ etc. 



the corresponding carbon atoms is shown in Fig. [t] for the case of iridium substrate). Then, the probability P{Yi = 1) 
is assigned as 

{1 if Wi = w^^^ with j < Ny,, 
0.5 if^, =^(^) with j = 7V^, (B5) 
\iwi= w^^^ with j > N^. 

With this method, a cluster of hydrogen atoms is formed around the sites where the quantity Wi tends to grow (i.e. 
around Sa and 5'^). The input parameter controls the size of this cluster. The disorder is only located at the 
cluster edges. 

Xi^ instead, is used to generate the superlattice disorder, which consists in some hydrogen clusters being randomly 
missing from the superlattice. The probability P{Xi = 1) is set equal to the input parameter nc, with < nc < 1, 
which therefore assumes the meaning of the ratio between the average number of hydrogenated half-supercells (or 
equivalently average number of clusters) and the total number of half-supercells. 

The hydrogenation model described above produces clusters inside which only one graphene sublattice is hydro- 
genated. This leads to the formation of midgap states in the electronic structure, associated with dangling bonds. 
However, these states are believed to be an artefact of the TB model, due to the fact that bond relaxation is neglected. 
To avoid this, after hydrogen atoms are generated according to the method described above, a final step is introduced: 
additional hydrogen atoms are placed on top of the carbon atoms that have at least two nearest neighbors being 
hydrogenated. 

Appendix C: Calculation of momentum- energy resolved density of states 

We consider a sample composed of Ni and N2 graphene unit cells along the directions of ai and a2, respectively, 
with periodic boundary conditions on both directions. A^i and A^2 are chosen to be multiples of m, the size of the 
supercell in units of the graphene lattice constant, so that the sample contains exactly ^ x ^ supercells. The sample 
is then hydrogenated as described in the previous section. An example, obtained with A^i = N2 = 20, m/n = 10/9 
(graphene on iridium), Uc = 0.75, and = 4, is shown in Fig. [8|i. The actual samples that are simulated are much 
larger: Ni = N2 = 120 is used for the SLIO case, while Ni = N2 = 117 for the SL13 case. Such values have been 
checked to be large enough to ensure the convergence of A(k, k; E) (see next section). 

The generic orbital of the TB representation can be indicated as 11,^), where 1 is the graphene lattice vector, i.e. 
1 = UiQi with Hi = 1, . . . A/^, and q is the orbital index within the cell (g' = 1, 2 for the A and B carbon orbitals, 
respectively). This is the real space representation. However, one could also work in the /c-space representation, 
which is obtained by restricting q = 1^2 (projected space for carbon atoms only) and by using as basis the set {|k, g)} 
defined by (1, g'l |k, ^^2) = ^(?i,g2^^^" W^i^2- The k vectors are discrete because a finite volume is considered: if 
are the primitive vectors of the reciprocal lattice, i.e. • dj = 27TSij^ we have k = {mi/Ni)di + (^2/^2)d2, where 
mi, 1712 ^ ^ (it can be proved that the K point is included in the grid if both Ni and 7V2 are multiple of 3). Also, 
since the set of 1 vectors is discrete, it follows that only a number N1N2 of k vectors, spanning a Brillouin zone, give 
rise to independent basis vectors. 

Let H be the electron Hamiltonian. We recall that the spectral function at the energy E is defined as A = i{G^ — G^), 
where = [{E -\-ir])I — H]~'^ is the retarded Green's function and G^ = G^^ is the advanced one {r] is an infinitesimal 
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FIG. 8: (a) Example of sample used for calculating the momentum-energy resolved density of states. The structure is made of 
iVi X N2 graphene unit cells, with A^i and N2 being chosen so that the sample is a multiple of the super lattice unit cell (region 
enclosed by the green line). Periodic boundary conditions are applied on both the ai and a2 directions, (b) The same structure 
can be viewed as a linear chain oi N — 2N2 slabs (each slab corresponding to a row of atoms) with a periodic closure at the 
two ends. 



positive quantity). While the diagonal elements of A in real space have the meaning of a density of states in energy 
and physical space, its diagonal elements in /c-space give the density of states in energy and momentum, which in turn 
corresponds to the physical quantity measured by ARPES. The diagonal elements of A in /c-space can be expanded 
as 

A(k,^;k,^;^) = ^(k,^|li,^i)A(li,^i;l2,^2;^)(l2,^2|k,^) 
^ ^ ii 1 

where we recognize a discrete Fourier transform with respect to the relative variable 1 = li — I2. We define 



A(k, k; E) = Yl ^(^' ^5 k, E) , 



(C2) 
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i-1,i-1 ^i,i ^i+1,i+1 



FIG. 9: Pictorial representation of the renormalization method in the circles represent nodes (or slabs) and the arrows 
the different types of coupling between them. The method consists in eliminating node i, while leaving unchanged the solution 
for all the remaining nodes by a proper renormalization of the matrix blocks of Q. 



where Ag = aQQ3V3/2 is the area of the graphene unit cell, so that A(k, k; £^)/(27r) gives the number of states per 
unit energy, per unit k, and per graphene unit cell (apart from spin degeneracy). The calculation of A(k, k; E) is thus 
performed by first computing the spectral function in real space and then Fourier transforming according to ( C1|C2 ). 

A method to compute A in real space is by diagonalization of H. Indeed, if are the orthonormal eigenstates 

of H with corresponding eigenvalues {cq,}, we have 



A(li,^i; 12,^2;^) 



E 



2ri 



M\uqiWa{h,q2), (C3) 



with Tpcxi}^ q) = {l^q\tpa) the generic eigenfunction in real space. The numerical computation of (C3) can be efficiently 



done by setting a finite value of r] and by finding for each energy E the eigenvalues (and corresponding eigenvectors) 
that are closest to it, using well known methods for large and sparse eigenvalue problems [40]. Nevertheless, we propose 
here an alternative method based on Green's functions. Although in the case considered here our method does not 
improve the computational time with respect to the diagonalization technique, because almost all the off-diagonal of 
the spectral function in real space need to be calculated, the method could be interesting in other situations, such as 
for the calculation of the local density of states in real space, where only few elements of the spectral function are 
needed. The method could also be useful when the storage of the eigenvectors is a major problem. 

As illustrated in Fig. [8^, we divide the sample in N = 2N2 slabs along the a2 direction so that each slab corresponds 
to a row of atoms (the choice between ai and a2 is arbitrary). The structure is therefore of the type in Fig. [sJd: a 
linear chain of slabs with a periodic boundary conditions at the two ends. Using block matrix notation in real space, 
Q = {E -\- ir])I — H has the form 



/ ^1,1 ^1,2 
^2,1 ^2,2 ^2,3 



1,N 



\ 



3,2 



^N,N-1 ^N,N ) 



(C4) 



where each block represents the coupling between a pair of slabs. We notice that only the elements of the s pectral 
function that connect orbitals belonging to the same graphene sublattice (i.e. same q) are needed in ([Cl][C2|. From 
Fig. [8^, it can be seen that there is a correspondence between g = 1,2 and the odd and even slabs, respectively. 
Therefore, only the matrix block of A (and thus of G^) connecting slabs with the same parity need to be calculated. 
In order to avoid the calculation of the unnecessary matrix blocks, the renormalization method [30 is employed: an 
equivalent 1]°^^ (1^®^®") matrix for the odd (even) slabs alone is computed by decimating the even (odd) ones. We 
recall that the decimation of a single node i from a linear chain, as depicted in Fig. |9j is achieved by renormalizing 
the matrix blocks of Vt for the adjacent nodes according to the formulas [30] : 

^i — 1 — ^i— 1 ^i— — 1 i 
^i — l,i+l ^i — ? 



By repeated use of (C5), the following algorithm can be derived to compute the renormalized matrix Q"^^^ for the 
odd slabs alone (Fig. |10|): 
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odd 
N/2,1 




FIG. 10: Same structure as in Fig. 



D, where the even slabs have been decimated. 



N 



FIG. 11: Structure corresponding to g 



1. for z = 1, . . . , N/2 initialize 



odd 



2i-l,2i-l ; 



(C6) 



2. for i = l,...,N/2 

— set j = k = mod(z, y] 

— compute 



1, / = mod(j,7V) + 1 



Qodd _ Qodd _ o {(^ ^"^0 



k,k k,k 



(C7) 



A similar algorithm can be derived for 1^^^^". Both and have the same shape as ft (Eq. C4L but with 

N N/2. In the following, we will therefore refer to Q for brevity, implicitly assuming that what we say must be 
applied separately to 1]°^^ and 1^^^^". 

The retarded Green's function at the energy E is obtained by inverting Q. Here we present an algorithm for 
recursively calculating the blocks of G^, extending the one in [31] for the case ofr^i^Ar,^Ar,i 7^ 0. The algorithm is based 
on Dyson's equations. We recall that if the Hamiltonian is expressed as H = Hq + Hi {Hq is called the unperturbed 
Hamiltonian and Hi the perturbation one), the Dyson equations are given by = Gq -\- GqHiG^ = Gq + G^i^iGg, 
where Gq is the retarded Green's function corresponding to Hq. While the formulas that are presented here directly 
exploit time reversal symmetry, i.e. the fact that G^ = (G^)-^, their extension to the general case is straightforward. 

The first part of the algorithm consists in the calculation of certain blocks of for i = 1, . . . , A^, where 

is the retarded Green's function corresponding to the structure composed of only the nodes from z to without the 
periodic closure (Fig. pT]). Indeed, by applying Dyson's equations to the structure in Fig. 11 with the coupling blocks 
—fli^i-^i and —fti-^i^i treated as the perturbation Hamiltonian, it is possible to relate ^ '"^^ to and derive 

the following equations: 

1. initialize 



9n,n ^ ~ 



AT, at; 



(C8) 
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2. for i = — 1, . . . , 1 compute 



r7^,(^) _ ^ r7^,(^+l)^ y 

r7^,(^ 



r7^,(^) _ r7^,(^+l) _ r7^,(^^ 



V,i+1 ^^i+l,ii/i,Ar 



(C9) 
(CIO) 
(Cll) 
(C12) 



The blocks g^j^^^^ for i > 1 can be discarded. The second part of the algorithm is obtained by applying again Dyson's 
equations, but to the original structure in Fig. [8]3, with the perturbation Hamiltonian given by the coupling blocks 
—Qi^i^i, — ^Ar,i, and —^i^n- The formulas are as follows 

3. initialize 



r7^,(l)o 



rn,{l) ( J , o r7^,(l)^"-^^ r7^,(l) 



r7^,(l) 
^1,1 



4. for z = l,...,Ar 

— if z > 1 compute 



9i 



for j 



1, . . . , A/" compute 



{Gl,f 



r7^,(^)Q 



(C13) 



(C14) 

(C15) 
(C16) 



As soon as each block of is computed, its contribution to (CI) can be evaluated and the block is ready for being 
discarded unless is used later by the algorithm. It can be checked that the first row of G^ has to be fully saved, while 
G[,^ is needed to calculate G^^j^^ and G[^+i to calculate G[^x,i+i- conclusion, compared to the direct inversion 



of the matrix in (C4), the proposed algorithm allows to reduce of about a factor of 4 the number of blocks that are 



calculated, due to both the decimation of nodes with different parity and the exploitation of time reversal symmetry; 
moreover, the blocks are recursively calculated one after the other, thus saving memory. We note that our algorithm 
could also be used for partial inversion of (C4), as in the case of the calculation of the density of states in energy, 
where only the diagonal elements of the spectral function in real space are required: in this case, for i > 1, ( [C15l ) 
can be limited to j = z + 1 and (C16) is not required. Regarding the stability of the overall algorithm, it should be 
pointed out that a value of 77 = 10~^ eV was necessary in the simulations to avoid numerical artefacts; however, the 
corresponding broadening introduced in the A(k, k; E) plot is way much smaller than the one due to disorder. 



Appendix D: Convergence study w.r.t. sample size (and ensemble size) 

In order to account for disorder across different cells of the superlattice, each sample has to be large enough so 
that it contains a sufficient number of clusters and the effect of the periodic boundary condition is washed out. In 
addition, the size of the sample determines the discretization step in /c-space (the total number of k points along the 
red line in the inset of Fig. Ic of the main text is equal to N2): for a high-quality plot of A(k, k; E)^ each sample has 
to be large enough so that the grid in /c-space is sufficiently fine. 



In Figs. 12, we report a comparison of A(k, k; plots obtained by varying the sample size or the ensemble size. 
For the case with TVi x A^2 = 30 x 30 and 20 samples (Fig. p!2^), the number of k points is small and the dispersion 
looks quite vague (also note that this a zoomed view around the K point, so that only about 1/4 out of the N2 
points along the path are shown). Interestingly, the number of clusters seems to be already large enough to destroy 
the superlattice band structure: no repeated bands are visible, while the states are clearly arising from the graphene 



Dirac cone. The plot greatly improves when the size of the samples is increased to 90 x 90 (Fig. 12 3). However, 



further incresing the sample size (Fig. [T2p) or the number of samples (Fig. 12 i), gives only a slight improvement, 
which means that the result has already well converged. 
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k.(A-^) k,(A-^) k,(A') k,(A-^) 

FIG. 12: Averaged A(k, k; E) for ensembles with different sample size or ensemble size, all generated with the same supercell 
size (SLIO), filling factor (ric = 0.75), and cluster size (Nu, = 4): (a) A^i x A^s = 30 x 30, 20 samples; (b) 90 x 90, 20 samples; 
(c) 120 X 120, 20 samples; (d) 120 x 120, 50 samples. The color scale and path in /c-space are the same as in the main text. 



Appendix E: Procedure for band-gap extraction 

The band gap is extracted from each (ensemble-averaged) A(k, k; E) plot using a fitting technique. We recall that 
the path in /c-space is the one shown in the inset of Fig. Ic in the main text, so that k = (kx-,Ky)^ where Ky is the 
/c^y -coordinate of the K point. Since ky is fixed, we use the simplified notation A{kx^ k^'-, E). The fitting procedure is 
composed of the following steps. 

1. Manually choose a range of energies [Ei^E2] where to apply the fitting. 

2. Find for each energy E G [Ei^E2] the kx coordinate where the intensity is maximum, separately for positive 
and negative kx'. 

k^{E) such that A{k^,k^;E) = mdixA{kx,kx]E), (El) 

kx>0 

k~{E) such that A{k~ ,k~ ; E) = mdi-KA{kx,kx; E) . (E2) 

3. Compute for each E G [Ei^E2] the values w~^{E) and w~{E) as follows 

w+(E) = A{k+{E),kt{E);E) 

me^E'e[E„E,]A{k+{E),k+{E);E')' 
.-iE) = Aik,iE)k,iE);E) ^^^^ 

m^-yiE'e[E^,E2] ^ (E), kx [E); E') 

4. Apply a least-square fitting to the set of points {{E^ k^ {E))} [J {{E ^ k~{E))} with E G [Ei^E2]^ by using w~^{E) 
and w~{E) as weights and one of the following dispersion relations as fitting curve: 

E = ±(^hv\k,\ + ^^ , (E5) 




FIG. 13: Plot of averaged A(k, k; E) and fitting curve, for all the set of samples studied in this work, (i) and (1) are the same 
plots as in Fig. Ic of the manuscript, (j) and (k) the same as in Fig. 2a of the manuscript. 
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1 


2 


0.3 


0.7 
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0.087 


0.098 


0.22 
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1 


3 


0.35 


0.7 


L 


0.868 


0.393 




0.098 


0.19 
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0.35 


0.8 


LP 
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0.192 





0.23 



TABLE L 

Parameters of tfie A(k, k; E) fitting and broadening extraction. 



The result of the fitting for negative E is shown in Fig. 13, superimposed to the original A(k, k; E) plot. 

For each 74(k, k;£^) plot, a measure of the broadening is also extracted. We consider a specific kx value, /c^, and 
compute the quantity 2B as the difference between the two energies at which the function A{k^ ^ k^] E) decreases to 
half of its maximum value. 

The input and output parameters of the band -gap and broadening extraction are collected in Table |l] for each set of 
sample: L, P, and LP refer to the fitting curves ( |E5[ ), (E6) and (E7), respectively, vp = (3/2)acc|7|/^ is the graphene 
Fermi velocity, and mo is the electron rest mass. 

The use of different fitting curves deserves an explanation. We notice that relation ( |E7[ ) is the most physical one 
since it describes the ID quantization of the graphene dispersion relation. However, for Ny^ = 3, i.e. cases (b), (e), 
(h), and (k) of Fig. [l3] and Table |l) the averaging effect of disorder at the cluster edges seems to be not strong enough 
to reach the typical dispersion relation (in fact, the repeated bands of the superlattice are still slightly visible in the 
A(k, k;£^) plot), and the functional dependence in ( |E5[ ) gives a better fitting. Also, when then parabolicit y is large 
in the energy range of interest, such as in cases (i) and (1) of Fig. [l3] and Table |l) the use of rtE6h instead of rtE7h does 
not make a significant difference in the gap value. 



Appendix F: Transport calculation 



We consider a structure such as the one represented in Fig. [Mp. The structure is divided in slabs along the 
longitudinal direction so that each slab corresponds to an atomic row. The slabs inside the device region are numbered 
from 1 to A^. 

The zero temperature conductance G is given by the transmission function T{E) at the Fermi energy Ep^ 

G = ^T{E = Ef) , (Fl) 

where in turn the transmission function is computed through Green's functions as [24] 

T{E) = Tr [F^G^F^G^] . (F2) 

In this equation, F^/^ = i(I]^'^/^ — H^'^/^) is the broadening function due to the left/right lead, where J]^'^/^ is 
the self-energy representing the renormalization of the Hamiltonian of the device region alone due to the presence of 
the semi- infinite left /right lead, and S^'^/^ = (S^'^/^)t. Since the only non-null block of S^'^ is S^'^ and the only 



non-null one of E^'^ is S^^, (F2) can be rewritten as 

r(^) = TV [rf,iGi,^r^_^G^,i] . (F3) 

The calculation of the G^ blocks can be efficiently performed using well known methods, such as the already mentioned 
recursive algorithm [31|, or a combination of the recursive and the renormalization algorithms ^2], and therefore it 
is not treated here. Instead, we focus on the calculation of the lead self-energies. 
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We notice that in our case the unit ceh of each lead is made of M = 4 slabs (Fig. [l4]3). Here, we propose an 
algorithm, based on the renormalization method [30 , to reduce the size of the unit cell to only one slab, such that the 
usual Sancho-Rubio algorithm [25] can then be applied on matrices having a reduced size, thus saving computational 
time. We notice that, in the specific case considered in this work, analytical espressions for the self-energies could 
have been used [33 . However, the numerical technique presented here is more general: for example, it can also be 
applied in the presence of a magnetic field. 

We consider only the case of a left lead, the generalization to the right case being straightforward. The self-energy 
due to the left lead is defined as 

S?,! = ^^1,055,0^^0,1 , (F4) 

where Q = {E -\- ir])I — H and g'^ is the retarded Green's function for the case in which the coupling between the 
device and the leads is set to zero [24 . Suppose that the unit cell of the lead contains M slabs. The matrix of 
the isolated left lead has thus the structure 




X 



(b) 



-M 


unit cell 


1 


1 1 
-M+1 






















left lead 

H 


^ device 

1,0 



FIG. 14: (a) Example of device used in transport simulations: hydrogenated channel of size W x L between two leads of pristine 
graphene. The device is aligned so that its longitudinal direction corresponds to an armchair direction, (b) Slab representation 
of the same structure. The unit cell inside the lead regions can be viewed as being made of M = 4 slabs, where each slab 
corresponds to a row of carbon atoms. 
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with 



I ^-M+l,-M+l ^-M+l,-M+2 
^-M+2,-M+l 

V 



^-1,-1 ^-1,0 
^0,-1 ^0,0 / 



(F6) 



As a first step, we consider Vf^ and decimate all the slabs from —1 backward to —M + 2 (assuming M > 2). We 

define = 1^0,0, = a^^^ = ^^^^ = ^o,-i- The generic iteration of index n (n = 1, . . . , M — 2) 

consists in eliminating the second last node from the matrix 



with the equations 



• ^ — n— 1, — n— 1 n— 1, — n 



4") = (4"-i))"'a("-i), 

0-2 — 1,— n— 1 — n— 1, — n I "-2 j 

= -f}_„_i,_„(4"-^^)"'a("-i), 



(F7) 



(F8) 



which are simply an application of (C5). At the end, we obtain the renormalized matrix 



, 4^-2) ^(M-2) 



4M-2) 



■M,-M+l 
(M-2) 
'2 



a 



V 



(M-2) 
^(M-2) ^(M-2) y 



(F9) 



As a second step, we consider and decimate all the even slabs (assuming M > 1). By using the formulas (again 
an application of Eqs. C5 ) 



5i°^ = d- 



(M-2) 



.(0) 



v(0) 



^-M+l,-M ■ 
(M-2) 



.^(M-2) (^4^-2) j \(M-2)^ 

^ /.(M-2)\""^ 
-^^-M,-M+l I <^2 ) 

we get a new renormalized Vt^ matrix, 

(■■ ^ 

" = a(0) 

4°) a(0) 



(FIO) 



(Fll) 
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This matrix has the same structure as the one used in the Sancho-Rubio algorithm [25^. The generic iteration of index 
n (n = 1, 2, . . .) of this algorithm actually consists in the decimation of the slabs with even indexes from the matrix 



(„_1) An-l) ^(n-l) 



V 



/3("-i) 4""'^ a^""'^ 

^(n-l) ^("-1) J 



(F12) 



by using the formulas (again from Eqs. C5 ) 



4") 


= 4"-^) - 


An) 


- ,5("-^^ - 

— "2 




= -a("-i) 


0(n) 


= _«(»-!) 



(F13) 



until convergence, i.e. until the coupling matrices a^^^ and P^^^ become sufficiently small. At the end, we can 
approximate q = {^^^^ , where n stands for the index of the last iteration. 
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